A Turbo-Decoding Weighted Forward-Backward Algorithm for Multimodal Speech Recognition

Authors

  • Simon Receveur
  • David Scheler
  • Tim Fingscheidt
Abstract

Since the performance of automatic speech recognition (ASR) still degrades under adverse acoustic conditions, recognition robustness can be improved by incorporating further modalities. The resulting question of information fusion shows interesting parallels to problems in digital communications, where the turbo principle revolutionized reliable communication. In this paper, we examine whether the large gains obtained in communications can also be achieved in ASR, since the decoding algorithms are often practically the same: the Viterbi algorithm or the forward-backward algorithm (FBA). First, we show that an ASR turbo recognition scheme can be implemented within the classical FBA framework by modifying only the observation likelihoods; second, we extend our solution to a generalized turbo ASR approach that is fully applicable to multimodal ASR. Applied to an audio-visual speech recognition task, our proposed method clearly outperforms both a conventional coupled hidden Markov model approach and an iterative state-of-the-art approach, with up to 32.3% relative reduction in word error rate.
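The idea that a turbo scheme fits into the classical FBA by changing only the observation likelihoods can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, for simplicity, that both modalities share one state space and one transition matrix, that the other modality's soft information enters as a multiplicative factor raised to a hypothetical exponent `weight`, and that extrinsic information is obtained by dividing out what was fed in. All function and parameter names are illustrative.

```python
from typing import List

Matrix = List[List[float]]


def forward_backward(init: List[float], trans: Matrix, lik: Matrix) -> Matrix:
    """Classical scaled forward-backward pass; returns state posteriors gamma[t][s]."""
    T, S = len(lik), len(init)
    alpha: Matrix = []
    for t in range(T):
        if t == 0:
            row = [init[s] * lik[0][s] for s in range(S)]
        else:
            prev = alpha[-1]
            row = [lik[t][s] * sum(prev[r] * trans[r][s] for r in range(S))
                   for s in range(S)]
        norm = sum(row)
        alpha.append([a / norm for a in row])
    beta: Matrix = [[1.0] * S for _ in range(T)]
    for t in range(T - 2, -1, -1):
        row = [sum(trans[s][r] * lik[t + 1][r] * beta[t + 1][r] for r in range(S))
               for s in range(S)]
        norm = sum(row)
        beta[t] = [b / norm for b in row]
    gamma: Matrix = []
    for t in range(T):
        row = [alpha[t][s] * beta[t][s] for s in range(S)]
        norm = sum(row)
        gamma.append([g / norm for g in row])
    return gamma


def weight_likelihoods(lik: Matrix, ext: Matrix, weight: float) -> Matrix:
    """Assumed modification: scale each likelihood by the other modality's
    soft information raised to `weight`; the FBA itself stays unchanged."""
    return [[l * (e ** weight) for l, e in zip(l_row, e_row)]
            for l_row, e_row in zip(lik, ext)]


def extrinsic(posterior: Matrix, fed_in: Matrix, weight: float) -> Matrix:
    """Assumed extrinsic step: divide out the information that was fed in,
    then renormalize per frame."""
    out: Matrix = []
    for g_row, p_row in zip(posterior, fed_in):
        row = [g / max(p ** weight, 1e-300) for g, p in zip(g_row, p_row)]
        norm = sum(row)
        out.append([r / norm for r in row])
    return out


def turbo_fba(init: List[float], trans: Matrix, lik_audio: Matrix,
              lik_video: Matrix, weight: float = 1.0, n_iter: int = 3) -> Matrix:
    """Alternate between the two modalities, exchanging extrinsic information."""
    S = len(init)
    ext_v: Matrix = [[1.0 / S] * S for _ in lik_audio]  # uninformative start
    post_v: Matrix = ext_v
    for _ in range(n_iter):
        post_a = forward_backward(init, trans,
                                  weight_likelihoods(lik_audio, ext_v, weight))
        ext_a = extrinsic(post_a, ext_v, weight)
        post_v = forward_backward(init, trans,
                                  weight_likelihoods(lik_video, ext_a, weight))
        ext_v = extrinsic(post_v, ext_a, weight)
    return post_v
```

With `weight=0.0` the exchanged information is ignored and the sketch reduces to an ordinary FBA on the second modality alone, which makes the exponent a convenient knob for checking that the fusion step is the only thing that changed.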

Similar resources

Turbo Decoders for Audio-Visual Continuous Speech Recognition

Visual speech, i.e., video recordings of speakers’ mouths, plays an important role in improving the robustness properties of automatic speech recognition (ASR) against noise. Optimal fusion of audio and video modalities is still one of the major challenges that attracts significant interest in the realm of audiovisual ASR. Recently, turbo decoders (TDs) have been successful in addressing the au...

An Iterative Decoding Approach to Document Image Analysis

We introduce an iterative approach to recognizing two-dimensional grammatical structure within digital images, which we term “turbo recognition.” Inspired by the success of turbo decoding for channel coding of one-dimensional sequences, we develop a recognition scheme for images based on two independent views of the same underlying message. These correspond to two independent image sources, one...

Decoding with Finite-State Transducers on GPUs

Weighted finite automata and transducers (including hidden Markov models and conditional random fields) are widely used in natural language processing (NLP) to perform tasks such as morphological analysis, part-of-speech tagging, chunking, named entity recognition, speech recognition, and others. Parallelizing finite state algorithms on graphics processing units (GPUs) would benefit many areas ...

Concurrent Turbo-Decoding

The turbo-decoding algorithm can be viewed as a message passing procedure on a graph. We contrast the standard “forward-backward” turbo-decoding algorithm with a new “concurrent” algorithm that is suited to parallel implementation.

Efficient Methods for Automatic Speech Recognition

This thesis presents work in the area of automatic speech recognition (ASR). The thesis focuses on methods for increasing the efficiency of speech recognition systems and on techniques for efficient representation of different types of knowledge in the decoding process. In this work, several decoding algorithms and recognition systems have been developed, aimed at various recognition tasks. The...

Publication date: 2013